in Proc. of the 3rd Workshop on Algorithmic Learning Theory, pp. 208-219, 1992.

1. N. Abe, K. Yamanishi, A. Nakamura, H. Mamitsuka, J. Takeuchi, & H. Li:
"Distributed and Active Learning," The Foundations of Real-World Intelligence, Oct.
2001.

2. J. Takeuchi & K. Yamanishi:

"Statistical outlier detection in data mining" (in Japanese), Bulletin of the Japan Society
for Industrial and Applied Mathematics, Vol. 10, No. 3, 2001.

3. J. Takeuchi:

"Asymptotically Minimax Codes by Bayes Procedures" (in Japanese), in Proc. of IEICE
Society Conference, October 1998.

4. J. Takeuchi:

"Stochastic complexity and Jeffreys mixture prediction strategies" (in Japanese), in
Proc. of the first Workshop on Information Based Induction Sciences, pp. 9-16, 1998.

5. A. R. Barron & J. Takeuchi: 

"Mixture models achieving optimal coding regret," in Proc. of 1998 IEEE Inform. 
Theory Workshop, 1998. 

Other Conference Papers (selected) 

1. J. Takeuchi & A. R. Barron:

"Robustly minimax codes for universal data compression", in Proc. of the 21st
Symposium on Information Theory and its Applications (SITA '98), 1998.

2. J. Takeuchi & A. R. Barron:

"Asymptotically minimax regret for exponential families", in Proc. of the 20th
Symposium on Information Theory and its Applications (SITA '97), pp. 665-668, 1997.
Best paper award at SITA '97.

3. J. Takeuchi & S. Amari:

"The alpha-parallel prior and its properties" (in Japanese), in Technical Report of
IEICE, IT26-20, pp. 61-66, 1996.

4. J. Takeuchi & K. Kawabata:

"On data compression algorithms by Bayes coding for Markov sources" (in Japanese),
in Proc. of the 17th Symposium on Information Theory and its Applications (SITA '94),
pp. 513-516, 1994.
• Data Mining and Machine Learning

• Spatial Reasoning

• Databases and Legacy Systems

Data Mining and Machine Learning



Outlier Detection Using Replicator Neural Networks 

Hongxing He, Simon Hawkins, Graham Williams, Rohan Baxter
DaWaK2002 

Mining Temporal Patterns from Health Care Data 

Weiqiang Lin, Mehmet Orgun, Graham Williams
DaWaK2002 

Feature Selection for Pathology Laboratory Monitoring 

Simon Hawkins, Graham Williams, Rohan Baxter 

Topics in Health Information Management, Volume 22 Number 1 August 2001 Pages 14-23 

The Pathology Explorer 

Graham Williams and Simon Hawkins 

CMIS Technical Report Number 01/116 

Report in Confidence to the Health Insurance Commission 

Advances in Knowledge Discovery and Data Mining 

David Cheung, Graham J. Williams, Qing Li 

5th Pacific Asia Conference, PAKDD 2001, Hong Kong, China, April 2001, Proceedings.
Lecture Notes in Artificial Intelligence, Volume 2035, Springer. 
Feature Selection for Temporal Health Records

Rohan A. Baxter, Graham J. Williams, Hongxing He 

in Advances in Knowledge Discovery and Data Mining, 

Edited by David Cheung, Graham J. Williams, Qing Li

Lecture Notes in Artificial Intelligence, Volume 2035, Springer, April 2001.

Proceedings of the 5th Pacific Asia Conference on Knowledge Discovery and Data Mining 

PAKDD 2001, Hong Kong, China. 

Temporal Data Mining Using Hidden Markov-Local Polynomial Models 

Weiqiang Lin, Mehmet A. Orgun, Graham J. Williams

in Advances in Knowledge Discovery and Data Mining,

Edited by David Cheung, Graham J. Williams, Qing Li

Lecture Notes in Artificial Intelligence, Volume 2035, Springer, April 2001. 

Proceedings of the 5th Pacific Asia Conference on Knowledge Discovery and Data Mining 

PAKDD 2001, Hong Kong, China. 

Data Mining of Administrative Claims Data for Pathology Services 

Simon Hawkins, Graham Williams, Rohan Baxter, Peter Christen, Michael Fett, Markus Hegland, 
Fuchun Huang, Ole Nielsen, Tatiana Semanova, Andrew Smith 

Hawaii International Conference on System Sciences (HICSS-35), Data Mining in Health, 
January 2001, Hawaii, USA 

Temporal data mining using multi-level local polynomial models 

Weiqiang Lin, Mehmet A. Orgun, Graham J. Williams

In Second International Conference on Intelligent Data Engineering and Automated Learning 
Lecture Notes in Computer Science, Springer, December 2000. 
IDEAL 2000, Hong Kong. 

On-line Unsupervised Outlier Detection Using Finite Mixtures with Discounting Learning 



Kenji Yamanishi, Jun-ichi Takeuchi, Graham Williams, Peter Milne 

Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and 
Data Mining 

KDD-01, August 20-23, 2000, Boston, MA USA

Applications of Artificial Intelligence in Industry

Dickson Lukose, Graham Williams (editors) 

Proceedings of the Symposium on the Application of Artificial Intelligence in Industry. 
Melbourne, Australia, August 2000 
ISBN 0730027937 
Mining Taxation Data with Parallel BMARS

[pdf ps]



Sergey Bakin, Markus Hegland, and Graham Williams 
Parallel Algorithms and Applications 
Vol. 15, pp. 37-55, May 2000

The Integrated Delivery of Large-Scale Data Mining: The ACSys Data Mining Project 

Graham Williams, Irfan Altas, Sergey Bakin, Peter Christen, Markus Hegland,
Alonso Marquez, Peter Milne, Rajehndra Nagappan, and Stephen Roberts
In Large-Scale Parallel Data Mining, State-of-the-Art Survey
Edited by Mohammed J. Zaki and Ching-Tien Ho
Lecture Notes in Artificial Intelligence, Volume 1759
Springer-Verlag, 2000

Data Mining Tools 

Irfan Altas, Sergey Bakin, Markus Hegland, Stephen Roberts, Berwin Turlach, and Graham 
Williams 

IEEE Transactions on Concurrency 
Submitted 1999 

An Overview of ACSys Data Mining

Graham J. Williams 

Computational Techniques and Applications Conference and Workshops 
(CTAC99) 

Canberra, September 1999

Integrated Delivery of Data Mining

Graham J. Williams 

KDD'99 Workshop on Large-Scale Parallel KDD Systems 
San Diego, August 1999 

Evolving Interestingness for Data Mining 

Graham J. Williams 

Third Pacific-Asia Conference on Knowledge Discovery and Data Mining 
Beijing, April 1999 

Data Mining Tutorial 

Graham J. Williams 
SEAL'98 

Canberra, November 1998

Evolutionary Techniques in Data Mining Interestingness

Graham J. Williams

Workshop on Evolutionary Computation 
Canberra, October 1998 

The Data Miner's Arcade: Pluggable Data Mining 

Graham J. Williams 
Technical Report 
May 1998. 

Abstract 

The Data Miner's Arcade is a Java-based environment for data mining. It implements an
Object-Oriented model for the Data Mining process, with standard interfaces for accessing data and for
delivering results. By developing standards, new tools can plug into the environment with a
minimum of effort, providing "Plug-n-Play" opportunities with new tools as they become
available. Data can be accessed from Database systems through ODBC and JDBC, or from other 
sources and managed internally within the Arcade. The Extensible Markup Language (XML) is 
used as the target "language" for all Data Mining tools within the environment. The Predictive 
Modelling Markup Language (PMML) developed by UIC is an example of the XML markup that 
the system handles. Data Mining tools produce as their output documents that conform to PMML. 
These can then be visualised, run, or combined with other models as appropriate, all within The 
Data Miner's Arcade environment. 

To What Extent can Data Mining be Proceduralised 

Graham J. Williams 
Panel Discussion 

Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD-98) 
Melbourne, April 1998. 

High Performance Data Management Issues in Data Mining

Graham J. Williams 

Presented to the Workshop on Parallel and Distributed Data Mining 
Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD-98) 
Melbourne, April 1998. 

Mining the Knowledge Mine: The Hot Spots Methodology for Mining Large Real World 
Databases 

[pdf ps ps.gz ] 

Graham J. Williams and Zhexue Huang 
in Advanced Topics in Artificial Intelligence 
Lecture Notes in Artificial Intelligence 
Volume 1342, Pages 340-348 
Springer-Verlag, 1997



Abstract 
As databases grow in size and complexity the task of adding value to the wealth of data becomes
difficult. Data mining has emerged as the technology to add value to enormous databases by
finding new and important snippets (or nuggets) of knowledge. With large training sets, however,
extremely large collections of nuggets are being extracted, leading to much "fool's gold" amongst
which to fossick for the real gold. Attention is now being directed towards the problem of how to 
better focus on the most precious nuggets. This paper presents the hot spots methodology, 
adopting a multi-strategy and interactive approach to help focus on the important nuggets. The 
methodology first performs data mining and then explores the resulting models to find the 
important nuggets contained therein. This approach is demonstrated in insurance and fraud 
applications. 

PEPNet: Parallel Evolutionary Programming for Constructing Artificial Neural Networks 

[pdf ps ps.gz ]

Gerrit A. Riessen, Graham J. Williams, and Xin Yao 

Sixth Annual Conference On Evolutionary Programming (EP97) 

Indianapolis 



This paper describes EPNet, an evolutionary artificial neural network algorithm,
and its extension to take advantage of a High Performance Computing Environment. PEPNet,
Parallel EPNet, implements four forms of parallelism and this paper describes two of those 
parallelisms. Experimental studies have shown promising results with better time and prediction 
performance. 

A Case Study in Knowledge Acquisition for Insurance Risk Assessment using a KDD 
Methodology 

[pdf ps ps.gz ] 

Graham J. Williams and Zhexue Huang 

Pacific Rim Knowledge Acquisition Workshop (PKAW96) 

Sydney 



We describe some initial experiences in dealing with the task of acquiring knowledge where a 
very large collection of case histories is available. A Knowledge Discovery in Databases (KDD) 
approach is taken. KDD is the process of extracting novel information and knowledge from large 
databases, consisting of many interacting stages performing specific data manipulation and 
transformation operations with an information flow from one stage onto the next (and usually with 
feedback into previous stages). We characterise our experiences of this process for the task of 
acquiring knowledge for the domain of motor vehicle insurance premium setting for NRMA 
Insurance Limited. 

Parallel Decision Tree Induction 

Graham J. Williams 

CSIRO DIT Data Mining Technical Report TR-DM-96024



Abstract 






Knowledge discovery in databases (or KDD) and its associated data mining technologies are
making enormous demands on traditional machine learning and statistical algorithms. KDD often
deals with extremely large databases, often in sizes measured in terms of gigabytes rather than
megabytes. Traditional machine learning and statistical techniques begin to be stretched beyond
their capabilities when the data sizes reach many thousands of records. In this paper I review our 
work in dealing with very large datasets in the context of traditional decision tree induction 
algorithms (ID3 and C4.5). MIL (Williams 1988, Williams 1990), for Multiple Inductive 
Learning, is a system for inducing multiple decision trees in parallel, transforming those trees to 
rules, and then intelligently merging the resulting rule sets into a unified knowledge base. Our 
efforts to parallelise the decision tree induction algorithm for the Fujitsu AP1000 and the
Thinking Machines Corporation's CM-5 high performance computers are also reviewed.

PEPNet: Parallel Evolutionary Artificial Neural Networks (Poster)
[pdf ps ps.gz ] 

Gerrit Riessen, Xin Yao, Zhexue Huang, Peter Milne, and Graham Williams 
Fifth Australian Conference on Neural Networks (ACNN96) 



Artificial Neural Networks (ANNs) provide an important classification tool for Knowledge 
Discovery in Databases (KDD). Unfortunately ANNs require considerable time to train, 
particularly when large datasets are involved. Training time is also adversely affected when the 
characteristics of the dataset are not consistent with the structure of the ANN. In developing 
ANNs there are no hard and fast rules for determining the structure of the network. Evolutionary 
Artificial Neural Networks (EANNs) take advantage of evolutionary search techniques to address 
some of the problems associated with developing optimal ANNs. EANNs dynamically modify the 
structure of the ANNs on the basis of performance. EPNet (Yao and Liu 1996) is a serial 
algorithm which adopts these ideas to produce efficient ANNs. Such techniques produce greater 
accuracy in the networks, however at the expense of extra computational and storage 
requirements. Our work focuses on PEANNs, Parallel Evolutionary Artificial Neural Networks. 
PEANNs have the potential to produce accurate networks in significantly less time than serial 
EANNs using larger datasets. A parallel implementation of EPNet, called PEPNet, is being 
developed to explore this hypothesis. 

KDD for Insurance Risk Assessment 

Graham J. Williams and Zhexue Huang 
March 1996 

CSIRO DIT Data Mining Technical Report TR-DM-96014



Insurance is a business of risks. Identifying and understanding areas of risk is an important task 
performed by an insurer. An assessment of risk is used to set the appropriate premium for 
insurance policies. This paper describes a KDD exercise which uses decision tree techniques to 
identify significant areas of risk within an insurance portfolio. The real world dataset used 
contains information about policies and insurance claims on those policies. Decision trees can be
constructed to identify and describe areas of high risk which are then evaluated, as a separate
exercise, in terms of claim frequency and claim costs. The paper stresses the idea of interactive 
post-processing, or evaluation, of the patterns that are illuminated by traditional data mining tools. 

Modelling the KDD Process 

Graham J. Williams and Zhexue Huang 
February 1996 

CSIRO DIT Data Mining Technical Report TR-DM-96013



Knowledge Discovery in Databases (KDD) is the process of extracting novel information and 
knowledge from large databases. This process consists of many interacting stages performing 
specific data manipulation and transformation operations with an information flow from one stage 
onto the next (and often back into previous stages). The process can be very complex and may 
exhibit much variety in the context of the variety of tasks undertaken within KDD. In this paper we
characterise our experiences of the KDD process and formalise its key elements in a model. A 
case study of insurance risk analysis for policy premium setting is used to illustrate the process 
and the model. The model provides a framework for comparing and differentiating various 
approaches to KDD. 

Inducing and Combining Multiple Decision Trees 

[280K gzip Postscript] 

Graham J. Williams 

PhD Thesis, Australian National University, 
Canberra, Australia, 1990 



Most activities in our daily life require us to make decisions, many subconsciously. Knowledge is 
the key to correct decision making. Its representation and use by machine has been a major goal 
throughout the history of computing machinery. Learning is one of the most important 
components of intelligence and is a crucial aspect of knowledge-based systems. The research 
reported on here focuses on the acquisition of decision trees and their transformation to rules. A 
well-established practical tool for machine learning (ID3) is used as a basis for an approach to 
building, and then combining, multiple decision trees. 

Combining Decision Trees: Initial results from the MIL algorithm 

Graham J Williams 

Artificial Intelligence Developments and Applications 
edited by J. S. Gero and R. B. Stanton 
Elsevier Science Publishers 
1988, Pages 273-289 

Some Experiments in Decision Tree Induction 

Graham J Williams 






Graham Williams: Publications, http://research.cmis.csiro.au/gjw/publications.html, 4/30/04



Australian Computer Journal 

1987, Volume 19, Number 2, Pages 84-91 



Spatial Reasoning 



Design of Decision Support Systems as Federated Information Systems 

D. J. Abel, Kerry Taylor, Gavin Walker, and Graham Williams 
Decision Support Systems for Sustainable Development 
Edited by Kersten, Mikolajuk, and Yeh 
Kluwer Academic Publishers, 1999 

Templates for Spatial Reasoning in Responsive GIS 

[8K gzip Postscript, first two pages only] 

Graham J Williams 

International Journal of Geographical Information Systems 
1995, Volume 9, Number 2, Pages 117-131 

Abstract 

Responsive geographical information systems (GIS) address the needs of the decision maker 
working in a spatially oriented environment where data is regularly updated, where the data is 
often voluminous, incomplete, and noisy, and where timely decisions must be made. Such 
environments stretch the capabilities of traditional GIS. A responsive GIS must play a more active 
role in the support of the decision maker. This paper introduces the concept of a responsive GIS 
and demonstrates the integration of artificial intelligence techniques and object-oriented database 
technology to provide such active support. Expert knowledge, represented as Templates, can have 
both spatial and temporal components, and remains within the GIS framework rather than 
providing separate, and often disjoint, GIS and Expert System modules. 

Representing Expectations in Spatial Information Systems 

Graham J Williams and Steve G. Woods 

Advances in Spatial Databases: Third International Symposium, SSD '93 
Edited by D. J. Abel and B. C. Ooi 

Lecture Notes in Computer Science, Volume 692, Springer-Verlag, 1993

GEM: A Micro-Computer Based Expert System for Geographic Domains

Graham J. Williams, J. Richard Davis and Paul M. Nanninga

Proceedings of the Sixth International Workshop and Conference on Expert Systems and Their
Applications

Avignon, France, 1986.
The Design of Expert Systems for Environmental Management

J. Richard Davis, Paul M. Nanninga and G. J. Williams 
Readings in Australian Geography 
Proceedings of the 21st IAG Conference 
Perth, Australia, 1988 

Geographic Expert Systems for Resource Management 

J. Richard Davis, Paul M. Nanninga and G. J. Williams 

Proceedings of the First Australian Conference on Applications of Expert Systems 
Sydney, Australia, 1985 



Databases and Decision Support Systems 



The Design of Decision Support Systems as Federated Information Systems 
D. J. Abel, K. L. Taylor, G. C. Walker, and G. J. Williams 
Abstract

This paper presents an introduction to frames-based representation schemes for use in the 
construction of rule-based expert systems. The features that are relevant to such expert systems 
are discussed, followed by an example of the type of rule application mechanism that the system 
implements. Advantages of such a system are discussed. 